
Numerical operations with Numpy¶

Elementwise operations
Basic reductions
Broadcasting
Array shape manipulation
Sorting data
Summary

Elementwise Operations¶

Basic Stuff¶

With Scalars¶

In [1]:

import numpy as np

In [2]:

a = np.array([1, 2, 3, 4])
a + 1

Out[2]:

array([2, 3, 4, 5])

In [3]:

2**a

Out[3]:

array([ 2, 4, 8, 16])

All arithmetic operates element-wise¶

In [4]:

b = np.ones(4) + 1
a - b

Out[4]:

array([-1., 0., 1., 2.])

In [5]:

a*b

Out[5]:

array([2., 4., 6., 8.])

In [6]:

j = np.arange(10)
2**(j + 1) - j

Out[6]:

array([ 2, 3, 6, 13, 28, 59, 122, 249, 504, 1015])

... and let's see how fast Numpy is compared to plain Python operations¶

In [7]:

a = np.arange(10000)
%timeit a + 1

4.09 µs ± 154 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [8]:

p = range(10000)
%timeit [i + 1 for i in p]

600 µs ± 55.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Clearly, Numpy is winning hands down: roughly 150 times faster in this run!¶
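The `%timeit` magic only works inside IPython or Jupyter. As a rough sketch, the same comparison can be made in a plain Python script with the standard-library `timeit` module. The repetition count of 1000 is an arbitrary choice here, and the absolute timings will differ from machine to machine.

```python
import timeit

import numpy as np

n = 10_000
arr = np.arange(n)
values = range(n)

# Time 1000 repetitions of each approach (the repetition count is an arbitrary choice).
t_numpy = timeit.timeit(lambda: arr + 1, number=1000)
t_python = timeit.timeit(lambda: [i + 1 for i in values], number=1000)

print(f"NumPy:       {t_numpy / 1000 * 1e6:.2f} µs per loop")
print(f"Pure Python: {t_python / 1000 * 1e6:.2f} µs per loop")

# Both approaches compute the same values.
same_result = (arr + 1).tolist() == [i + 1 for i in values]
```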

Also note that `*` is element-wise multiplication, not matrix multiplication:

In [9]:

c = np.ones((4, 4))
c

Out[9]:

array([[1., 1., 1., 1.], [1., 1., 1., 1.], [1., 1., 1., 1.], [1., 1., 1., 1.]])

In [10]:

c*c

Out[10]:

array([[1., 1., 1., 1.], [1., 1., 1., 1.], [1., 1., 1., 1.], [1., 1., 1., 1.]])

In [11]:

np.dot(c,c)

Out[11]:

array([[4., 4., 4., 4.], [4., 4., 4., 4.], [4., 4., 4., 4.], [4., 4., 4., 4.]])

In [12]:

# Or you can simply do
c.dot(c)

Out[12]:

array([[4., 4., 4., 4.], [4., 4., 4., 4.], [4., 4., 4., 4.], [4., 4., 4., 4.]])
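For 2D arrays, the `@` operator is another way to spell the matrix product. The short sketch below puts it side by side with the element-wise `*`, reusing the same 4x4 array of ones.

```python
import numpy as np

c = np.ones((4, 4))

elementwise = c * c        # multiplies matching entries
matrix_product = c @ c     # same result as np.dot(c, c) for 2D arrays

print(elementwise[0, 0])     # 1.0
print(matrix_product[0, 0])  # 4.0
```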

More operations¶

Comparisons¶

In [13]:

a = np.array([1, 2, 3, 4])
b = np.array([5, 2, 6, 4])
c = np.array([1, 2, 3, 4])

In [14]:

a==b

Out[14]:

array([False, True, False, True])

Array-wise comparisons¶

In [15]:

np.array_equal(a,b)

Out[15]:

False

In [16]:

np.array_equal(a,c)

Out[16]:

True

Logical operations¶

In [17]:

a = np.array([1, 1, 0, 0], dtype=bool)
b = np.array([1, 0, 1, 0], dtype=bool)
np.logical_or(a, b)

Out[17]:

array([ True, True, True, False])

In [18]:

np.logical_and(a,b)

Out[18]:

array([ True, False, False, False])

In [19]:

np.logical_not(a)  # takes one input; a second positional argument is treated as `out` and would overwrite b

Out[19]:

array([False, False, True, True])

In [20]:

np.logical_xor(a, b)

Out[20]:

array([False, True, True, False])

Transcendental functions:¶

In [21]:

a = np.arange(5)
np.sin(a)

Out[21]:

array([ 0. , 0.84147098, 0.90929743, 0.14112001, -0.7568025 ])

In [22]:

np.log(a)

<ipython-input-22-89b6b8e53c58>:1: RuntimeWarning: divide by zero encountered in log np.log(a)

Out[22]:

array([ -inf, 0. , 0.69314718, 1.09861229, 1.38629436])

In [23]:

np.exp(a)

Out[23]:

array([ 1. , 2.71828183, 7.3890561 , 20.08553692, 54.59815003])

Shape mismatches:¶

This raises a broadcasting error. We'll get to that soon enough...

In [24]:

a = np.arange(5)
a + np.array([2, 4])

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-24-fcc734eff5a6> in <module>
      1 a = np.arange(5)
----> 2 a + np.array([2,4])

ValueError: operands could not be broadcast together with shapes (5,) (2,)
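As a small preview of the broadcasting rules, the sketch below contrasts the failing pair of shapes with two pairs that do combine. It only illustrates the shape rule (dimensions must be equal, or one of them must be 1), not the full broadcasting machinery.

```python
import numpy as np

a = np.arange(5)

# Shapes (5,) and (2,) differ in their only dimension and neither is 1: incompatible.
try:
    a + np.array([2, 4])
    failed = False
except ValueError as err:
    failed = True
    print(err)

# Shapes (5,) and (1,) are compatible: the length-1 array is stretched.
stretched = a + np.array([2])

# Shapes (5, 1) and (2,) are compatible and combine into a (5, 2) result.
table = a[:, np.newaxis] + np.array([2, 4])
print(table.shape)
```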

Transposition: what is it?¶

In [25]:

a = np.triu(np.ones((4, 4)), 1)
a

Out[25]:

array([[0., 1., 1., 1.], [0., 0., 1., 1.], [0., 0., 0., 1.], [0., 0., 0., 0.]])

In [26]:

a.T

Out[26]:

array([[0., 0., 0., 0.], [1., 0., 0., 0.], [1., 1., 0., 0.], [1., 1., 1., 0.]])

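See what happened above?¶

Rows became columns: the array was mirrored across its main diagonal, so a.T[i, j] equals a[j, i]. For this particular matrix the result happens to look like flipping it upside-down and then right-to-left, but that is not what transposition does in general.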
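A non-square array makes the effect of `.T` easier to see. The sketch below uses a small 2x3 array (an arbitrary example, not from the notebook) to check the index swap and to show that the transpose shares memory with the original.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
mt = m.T

print(mt.shape)  # (3, 2)

# Every entry satisfies mt[i, j] == m[j, i].
entries_match = all(
    mt[i, j] == m[j, i] for i in range(3) for j in range(2)
)

# The transpose is a view: writing through it changes the original array.
mt[0, 1] = 99
print(m[1, 0])  # 99
```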

Food for thought¶

Look at np.allclose versus np.isclose. What do you think is the difference between the two?
What is the difference between np.triu and np.tril?
Also play with ranges: how is np.linspace different from np.logspace?
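One way to start exploring these questions is to call each pair of functions on the same small input and compare the results. The inputs below are arbitrary illustrative values.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.1])

print(np.isclose(x, y))    # one boolean per element
print(np.allclose(x, y))   # a single boolean for the whole comparison

m = np.ones((3, 3))
print(np.triu(m))          # keeps the upper triangle (including the diagonal)
print(np.tril(m))          # keeps the lower triangle (including the diagonal)

print(np.linspace(0, 2, 3))   # evenly spaced values: 0, 1, 2
print(np.logspace(0, 2, 3))   # evenly spaced exponents of 10: 1, 10, 100
```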

Basic reductions¶

Calculating sums¶

In [27]:

x = np.array([5, 6, 7, 8])
np.sum(x)

Out[27]:

26

Adding by rows and by columns¶

In [28]:

x = np.array([[2, 2], [6, 6]])
x

Out[28]:

array([[2, 2], [6, 6]])

In [29]:

x.sum(axis=0)  # column-wise sums (collapses the first dimension)

Out[29]:

array([8, 8])

In [30]:

# a more verbose way of doing it, but you get the idea
x[:, 0].sum(), x[:, 1].sum()

Out[30]:

(8, 8)

In [31]:

# Let's do it row-wise
x.sum(axis=1)

Out[31]:

array([ 4, 12])

In [32]:

x[0,:].sum(),x[1,:].sum()

Out[32]:

(4, 12)

In [33]:

# In higher dimensions
x = np.random.rand(2, 2, 2)
x

Out[33]:

array([[[0.34592414, 0.15460061], [0.8683121 , 0.68157837]], [[0.23800664, 0.93651685], [0.9379463 , 0.69130543]]])

In [34]:

x.sum(axis=2)[0,1]

Out[34]:

1.5498904649649252

In [35]:

# inspecting it in more detail
x[0, 1, :].sum()

Out[35]:

1.5498904649649252
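A helpful way to think about `axis` is in terms of shapes: summing over an axis removes that axis from the result. The sketch below uses a deterministic 2x3x4 array (an arbitrary choice, so the output is reproducible) to show this.

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)

# Summing over an axis removes that axis from the result's shape.
print(x.sum(axis=0).shape)  # (3, 4)
print(x.sum(axis=1).shape)  # (2, 4)
print(x.sum(axis=2).shape)  # (2, 3)

# Entry [i, j] of the axis=2 sum is the sum of the slice x[i, j, :].
matches = x.sum(axis=2)[0, 1] == x[0, 1, :].sum()
```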

Other forms of reductions¶

Extrema¶

In [36]:

x = np.array([2, 5, 6])
x.min()

Out[36]:

2

In [37]:

x.max()

Out[37]:

6

In [38]:

x.argmin()  # index of minimum

Out[38]:

0

In [39]:

x.argmax()  # index of maximum

Out[39]:

2

Logic Functions for Truth Value Testing¶

Numpy.all

all(a[, axis, out, keepdims]) Test whether all array elements along a given axis evaluate to True.

Try help(np.all) for more details.

In [40]:

np.all([True,True,False])

Out[40]:

False

In [41]:

np.all([[True,True],[False,True]])

Out[41]:

False

In [42]:

np.all([[True, True], [False, True]], axis=0)  # axis=1 tests each row instead and gives array([ True, False])

Out[42]:

array([False, True])

In [43]:

np.all([-5,4,7])

Out[43]:

True

In [44]:

np.all([1.0, np.nan])  # NaN ("Not a Number") is non-zero, so it counts as True; more on np.nan later

Out[44]:

True

In [45]:

c = np.array([False])
d = np.all([-1, 4, 5], out=c, keepdims=True)
id(d), id(c), c

Out[45]:

(140591000095120, 140591000095120, array([ True]))

Try the above operation without `keepdims=True`¶

What did you encounter? And why?
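As a hint, the shape of the `out` array has to match the shape of the reduction's result. The sketch below shows the two combinations that fit together; the variable names are illustrative only.

```python
import numpy as np

values = [-1, 4, 5]

# Without keepdims the reduction of a 1D input is 0-dimensional,
# so the output array must be 0-dimensional as well.
out_scalar = np.array(False)
result_scalar = np.all(values, out=out_scalar)
print(result_scalar.shape)  # ()

# With keepdims=True the reduced axis is kept with length 1,
# so an output array of shape (1,) fits.
out_kept = np.array([False])
result_kept = np.all(values, out=out_kept, keepdims=True)
print(result_kept.shape)  # (1,)
```

In both cases the returned object is the very array passed as `out`, which is why the two `id` values in the cell above are identical.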

Numpy.any

any(a[, axis, out, keepdims]) Test whether any array element along a given axis evaluates to True.

In [46]:

np.any([True,True,False])

Out[46]:

True

In [47]:

np.any([[True,False],[True,True]])

Out[47]:

True

In [48]:

np.any([[True,False],[True,True]],axis=0)

Out[48]:

array([ True, True])

In [49]:

np.any([-1,0,7])

Out[49]:

True

In [50]:

np.any(np.nan)

Out[50]:

True

In [51]:

c = np.array([False])
d = np.any([-1, 4, 5], out=c, keepdims=True)
id(d), id(c), c

Out[51]:

(140591000888144, 140591000888144, array([ True]))

In [52]:

d is c

Out[52]:

True

In [53]:

id(d),id(c)

Out[53]:

(140591000888144, 140591000888144)

In [54]:

Out[54]:

array([ True])

In [55]:

# Can we try array comparisons?
a = np.zeros((200, 200))

In [56]:

np.any(a!=0)

Out[56]:

False

In [57]:

np.all(a==a)

Out[57]:

True

In [58]:

a = np.array([3, 4, 5, 6])
b = np.array([4, 5, 6, 7])
c = np.array([6, 7, 3, 2])
((a <= b) & (b <= c)).all()  # same as np.all((a <= b) & (b <= c))

Out[58]:

False

In [59]:

a<=b

Out[59]:

array([ True, True, True, True])

In [60]:

b<=c

Out[60]:

array([ True, True, False, False])

What if we had used `np.any()` instead?¶

What would the answer be then?

In [61]:

((a<=b)&(b<=c)).any()

Out[61]:

True

Finite and Infinite¶

Test element-wise for finiteness and infinity

In [62]:

np.isfinite(1)# 1 is finite

Out[62]:

True

In [63]:

np.isfinite(np.exp(100))

Out[63]:

True

In [64]:

np.isfinite(np.exp(1000))

<ipython-input-64-5e887c69d6a8>:1: RuntimeWarning: overflow encountered in exp np.isfinite(np.exp(1000))

Out[64]:

False

In [65]:

np.exp(1000)

<ipython-input-65-47a6eab891c2>:1: RuntimeWarning: overflow encountered in exp np.exp(1000)

Out[65]:

inf

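Above is a limitation of floating-point numbers...¶

Mathematically, exp(1000) is finite, so the answer should be True. It is, however, far larger than the biggest value a 64-bit float can hold (about 1.8e308), so the computation overflows to inf.

...and if we ask `np.isinf()`: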

In [66]:

np.isinf(np.exp(1000))

<ipython-input-66-b70189efb4d4>:1: RuntimeWarning: overflow encountered in exp np.isinf(np.exp(1000))

Out[66]:

True
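To see where the overflow starts, one can look up the largest representable float64 with `np.finfo` and take its logarithm. This is only a sketch of the idea; the two test exponents 709 and 710 are chosen to sit just below and just above that threshold.

```python
import numpy as np

largest = np.finfo(np.float64).max
print(largest)            # about 1.8e308
print(np.log(largest))    # about 709.78

with np.errstate(over="ignore"):   # silence the overflow warning for this demo
    below = np.exp(709.0)          # still representable
    above = np.exp(710.0)          # overflows to inf

print(np.isfinite(below), np.isinf(above))
```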

In [67]:

np.isfinite(np.inf)# Kind of obvious as infinity is definitely NOT finite

Out[67]:

False

In [68]:

np.isfinite([np.log(-1.),1.,np.log(0)])# Trying same but in a different way...

<ipython-input-68-ee6f8431e4ab>:1: RuntimeWarning: invalid value encountered in log np.isfinite([np.log(-1.), 1., np.log(0)]) # Trying same but in a different way... <ipython-input-68-ee6f8431e4ab>:1: RuntimeWarning: divide by zero encountered in log np.isfinite([np.log(-1.), 1., np.log(0)]) # Trying same but in a different way...

Out[68]:

array([False, True, False])

How about `np.NINF`¶

In [69]:

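help(np.NINF)  # np.NINF is a plain Python float equal to -np.inf; the alias was removed in NumPy 2.0, where -np.inf is used instead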

Help on float object: class float(object) | float(x=0, /) | | Convert a string or number to a floating point number, if possible. | | Methods defined here: | | __abs__(self, /) | abs(self) | | __add__(self, value, /) | Return self+value. | | __bool__(self, /) | self != 0 | | __divmod__(self, value, /) | Return divmod(self, value). | | __eq__(self, value, /) | Return self==value. | | __float__(self, /) | float(self) | | __floordiv__(self, value, /) | Return self//value. | | __format__(self, format_spec, /) | Formats the float according to format_spec. | | __ge__(self, value, /) | Return self>=value. | | __getattribute__(self, name, /) | Return getattr(self, name). | | __getnewargs__(self, /) | | __gt__(self, value, /) | Return self>value. | | __hash__(self, /) | Return hash(self). | | __int__(self, /) | int(self) | | __le__(self, value, /) | Return self<=value. | | __lt__(self, value, /) | Return self<value. | | __mod__(self, value, /) | Return self%value. | | __mul__(self, value, /) | Return self*value. | | __ne__(self, value, /) | Return self!=value. | | __neg__(self, /) | -self | | __pos__(self, /) | +self | | __pow__(self, value, mod=None, /) | Return pow(self, value, mod). | | __radd__(self, value, /) | Return value+self. | | __rdivmod__(self, value, /) | Return divmod(value, self). | | __repr__(self, /) | Return repr(self). | | __rfloordiv__(self, value, /) | Return value//self. | | __rmod__(self, value, /) | Return value%self. | | __rmul__(self, value, /) | Return value*self. | | __round__(self, ndigits=None, /) | Return the Integral closest to x, rounding half toward even. | | When an argument is passed, work like built-in round(x, ndigits). | | __rpow__(self, value, mod=None, /) | Return pow(value, self, mod). | | __rsub__(self, value, /) | Return value-self. | | __rtruediv__(self, value, /) | Return value/self. | | __sub__(self, value, /) | Return self-value. | | __truediv__(self, value, /) | Return self/value. 
| | __trunc__(self, /) | Return the Integral closest to x between 0 and x. | | as_integer_ratio(self, /) | Return integer ratio. | | Return a pair of integers, whose ratio is exactly equal to the original float | and with a positive denominator. | | Raise OverflowError on infinities and a ValueError on NaNs. | | >>> (10.0).as_integer_ratio() | (10, 1) | >>> (0.0).as_integer_ratio() | (0, 1) | >>> (-.25).as_integer_ratio() | (-1, 4) | | conjugate(self, /) | Return self, the complex conjugate of any float. | | hex(self, /) | Return a hexadecimal representation of a floating-point number. | | >>> (-0.1).hex() | '-0x1.999999999999ap-4' | >>> 3.14159.hex() | '0x1.921f9f01b866ep+1' | | is_integer(self, /) | Return True if the float is an integer. | | ---------------------------------------------------------------------- | Class methods defined here: | | __getformat__(typestr, /) from builtins.type | You probably don't want to use this function. | | typestr | Must be 'double' or 'float'. | | It exists mainly to be used in Python's test suite. | | This function returns whichever of 'unknown', 'IEEE, big-endian' or 'IEEE, | little-endian' best describes the format of floating point numbers used by the | C type named by typestr. | | __set_format__(typestr, fmt, /) from builtins.type | You probably don't want to use this function. | | typestr | Must be 'double' or 'float'. | fmt | Must be one of 'unknown', 'IEEE, big-endian' or 'IEEE, little-endian', | and in addition can only be one of the latter two if it appears to | match the underlying C reality. | | It exists mainly to be used in Python's test suite. | | Override the automatic determination of C-level floating point type. | This affects how floats are converted to and from binary strings. | | fromhex(string, /) from builtins.type | Create a floating-point number from a hexadecimal string. 
| | >>> float.fromhex('0x1.ffffp10') | 2047.984375 | >>> float.fromhex('-0x1p-1074') | -5e-324 | | ---------------------------------------------------------------------- | Static methods defined here: | | __new__(*args, **kwargs) from builtins.type | Create and return a new object. See help(type) for accurate signature. | | ---------------------------------------------------------------------- | Data descriptors defined here: | | imag | the imaginary part of a complex number | | real | the real part of a complex number

In [70]:

np.isfinite(np.NINF)

Out[70]:

False

In [71]:

np.isfinite(np.nan)  # Remember, NaN = Not a Number

Out[71]:

False

In [72]:

x = np.array([-np.inf, 0., np.inf])
y = np.array([2, 2, 2])
np.isfinite(x, y)  # the second positional argument is `out`, so the result is written into y

Out[72]:

array([0, 1, 0])

In [73]:

x

Out[73]:

array([-inf, 0., inf])

In [74]:

y

Out[74]:

array([0, 1, 0])

Statistics - A quick brush up¶

In [75]:

a = np.array([5, 6, 7, 8])
b = np.array([[1, 2, 3], [4, 5, 6]])

In [76]:

# Computing the mean with the array method
a.mean()

Out[76]:

6.5

In [77]:

np.mean(a)  # the function form gives the same value

Out[77]:

6.5

In [78]:

np.median(a)# OK, same

Out[78]:

6.5

In [79]:

# Let's do the same for b
np.median(b)

Out[79]:

3.5

In [80]:

# For the last axis
np.median(b, axis=-1)

Out[80]:

array([2., 5.])

In [81]:

# For the second-to-last axis
np.median(b, axis=-2)

Out[81]:

array([2.5, 3.5, 4.5])

Standard deviation¶

In [82]:

a.std()

Out[82]:

1.118033988749895

In [83]:

b.std()

Out[83]:

1.707825127659933

In [84]:

np.std(a)

Out[84]:

1.118033988749895

In [85]:

c = np.array([[4, 5], [8, 9]])
np.std(c)

Out[85]:

2.0615528128088303

In [86]:

c

Out[86]:

array([[4, 5], [8, 9]])

In [87]:

np.std(c,axis=0)

Out[87]:

array([2., 2.])

In [88]:

np.std(c,axis=-1)

Out[88]:

array([0.5, 0.5])

In [89]:

np.std(c,axis=1)

Out[89]:

array([0.5, 0.5])

But the standard deviation can also be inaccurate...¶

In [90]:

c=np.zeros((2,256*256),dtype=np.float16)

In [91]:

c[0,:]=1.0

In [92]:

c[1,:]=0.1

In [93]:

np.std(d)

Out[93]:

0.0

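What happened up there?¶

Note that the cell above was passed `d` (the one-element boolean array left over from the `np.any` example) instead of `c`, so 0.0 is just the spread of a single value. Now let's compute the standard deviation of `c` itself, accumulating in float32...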

In [94]:

np.std(c,dtype=np.float32)

Out[94]:

0.4500123

How about `np.float64`?¶

In [95]:

np.std(c,dtype=np.float64)

Out[95]:

0.45001220703125

`np.float128`?¶

In [96]:

np.std(c, dtype=np.float128)  # the result no longer changes beyond float64; note that np.float128 is not available on every platform

Out[96]:

0.45001220703125
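Why 0.45001220703125 rather than exactly 0.45? A likely explanation is that the data itself is stored in float16, which cannot represent 0.1 exactly, so a wider accumulator only removes the summation error and not the storage error. The sketch below checks this on a smaller array (1024 columns is an arbitrary choice to keep it quick).

```python
import numpy as np

# float16 cannot store 0.1 exactly; the nearest representable value is slightly smaller.
stored = float(np.float16(0.1))
print(stored)  # 0.0999755859375

# Half of the values are 1.0 and half are the stored 0.1, so the exact
# standard deviation is half the distance between the two values.
expected_std = (1.0 - stored) / 2
print(expected_std)  # 0.45001220703125

# A smaller array than in the cells above (an arbitrary choice to keep this quick).
c = np.zeros((2, 1024), dtype=np.float16)
c[0, :] = 1.0
c[1, :] = 0.1
std64 = np.std(c, dtype=np.float64)
```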

Working with some data¶

Let's look at the populations of hares, lynxes and carrots in Northern Canada from 1900 to 1920.

In [97]:

# Let's view the data
!cat data/populations.txt

# year  hare    lynx    carrot
1900    30e3    4e3     48300
1901    47.2e3  6.1e3   48200
1902    70.2e3  9.8e3   41500
1903    77.4e3  35.2e3  38200
1904    36.3e3  59.4e3  40600
1905    20.6e3  41.7e3  39800
1906    18.1e3  19e3    38600
1907    21.4e3  13e3    42300
1908    22e3    8.3e3   44500
1909    25.4e3  9.1e3   42100
1910    27.1e3  7.4e3   46000
1911    40.3e3  8e3     46800
1912    57e3    12.3e3  43800
1913    76.6e3  19.5e3  40900
1914    52.3e3  45.7e3  39400
1915    19.5e3  51.1e3  39000
1916    11.2e3  29.7e3  36700
1917    7.6e3   15.8e3  41800
1918    14.6e3  9.7e3   43300
1919    16.2e3  10.1e3  41300
1920    24.7e3  8.6e3   47300

In [98]:

# load the data
data = np.loadtxt('data/populations.txt')

In [99]:

data

Out[99]:

array([[ 1900., 30000., 4000., 48300.], [ 1901., 47200., 6100., 48200.], [ 1902., 70200., 9800., 41500.], [ 1903., 77400., 35200., 38200.], [ 1904., 36300., 59400., 40600.], [ 1905., 20600., 41700., 39800.], [ 1906., 18100., 19000., 38600.], [ 1907., 21400., 13000., 42300.], [ 1908., 22000., 8300., 44500.], [ 1909., 25400., 9100., 42100.], [ 1910., 27100., 7400., 46000.], [ 1911., 40300., 8000., 46800.], [ 1912., 57000., 12300., 43800.], [ 1913., 76600., 19500., 40900.], [ 1914., 52300., 45700., 39400.], [ 1915., 19500., 51100., 39000.], [ 1916., 11200., 29700., 36700.], [ 1917., 7600., 15800., 41800.], [ 1918., 14600., 9700., 43300.], [ 1919., 16200., 10100., 41300.], [ 1920., 24700., 8600., 47300.]])

In [100]:

year, hares, lynxes, carrots = data.T  # What did I just do and why?
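To see what the unpacking does without needing the data file, the sketch below feeds the first three rows of the table to `np.loadtxt` through an in-memory `io.StringIO` object (a stand-in for `data/populations.txt`) and inspects the shapes.

```python
import io

import numpy as np

# The first three rows of the file above, inlined so the sketch is self-contained.
sample = io.StringIO(
    "# year hare lynx carrot\n"
    "1900 30e3 4e3 48300\n"
    "1901 47.2e3 6.1e3 48200\n"
    "1902 70.2e3 9.8e3 41500\n"
)
data = np.loadtxt(sample)        # lines starting with '#' are skipped by default
print(data.shape)                # (3, 4): one row per year, one column per quantity

# data.T has shape (4, 3), so unpacking it yields one 1D array per column.
year, hares, lynxes, carrots = data.T
print(hares)
```

Unpacking iterates over the first axis, which is why the transpose is needed: without it, the assignment would try to unpack rows (years) rather than columns.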

In [101]:

import matplotlib.pyplot as plt
%matplotlib inline

plt.axes([0.2, 0.1, 0.5, 0.8])
plt.plot(year, hares, year, lynxes, year, carrots)
plt.legend(('Hare', 'Lynx', 'Carrot'), loc=(1.05, 0.5))

Out[101]:

<matplotlib.legend.Legend at 0x7fdde5c79430>

So what were the mean and standard deviation of each population over time?¶

In [102]:

pop = data[:, 1:]
pop.mean(axis=0)

Out[102]:

array([34080.95238095, 20166.66666667, 42400. ])

In [103]:

pop.std(axis=0)

Out[103]:

array([20897.90645809, 16254.59153691, 3322.50622558])

Which animal had the highest population each year?¶

In [104]:

np.argmax(pop,axis=1)

Out[104]:

array([2, 2, 0, 0, 1, 1, 2, 2, 2, 2, 2, 2, 0, 0, 0, 1, 2, 2, 2, 2, 2])
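The indices returned by `np.argmax` refer to the columns of `pop` (0 for hare, 1 for lynx, 2 for carrot). As a small sketch, they can be turned into readable labels by indexing an array of names; the `species` array and the three sample rows below are introduced here for illustration.

```python
import numpy as np

# A few rows of the population columns (hare, lynx, carrot) from the table above.
pop = np.array([
    [30000.0, 4000.0, 48300.0],   # 1900
    [70200.0, 9800.0, 41500.0],   # 1902
    [36300.0, 59400.0, 40600.0],  # 1904
])
species = np.array(["Hare", "Lynx", "Carrot"])

largest_index = np.argmax(pop, axis=1)   # column index of each row's maximum
largest_name = species[largest_index]    # fancy indexing turns indices into labels
print(largest_name)
```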

Diffusion using a random walk algorithm¶

Let us consider a simple 1D random walk process: at each time step a walker jumps right or left with equal probability.

We are interested in finding the typical distance from the origin of a random walker after t left or right jumps. We are going to simulate many “walkers” to find this law, using array computing tricks: we create a 2D array with the “stories” (each walker has a story) in one direction and the time in the other.

Step 1: Let's initialize the number of stories and the maximum duration for which we follow the walkers¶

In [105]:

n_stories = 1000  # number of walkers
t_max = 200       # duration of time in which we follow the walkers

Step 2: Let's randomly choose every step of the walk to be 1 or -1¶

In [106]:

t = np.arange(t_max)
# randint excludes the upper bound, so (0, 2) draws 0 or 1; 2*x - 1 maps that to -1 or 1
steps = 2 * np.random.randint(0, 2, (n_stories, t_max)) - 1

Step 3: Let's verify that all steps are 1 or -1¶

In [107]:

np.unique(steps)

Out[107]:

array([-1, 1])

Step 4: We build the walks by summing the steps along the time axis¶

In [108]:

positions = np.cumsum(steps, axis=1)  # axis=1 is the time dimension
square_dist = np.square(positions)

Step 5: Let's take the mean along the axis of the stories¶

In [109]:

mean_square_dist=np.mean(square_dist,axis=0)

Step 6: Finally, let's plot the results¶

In [110]:

plt.figure(figsize=(8, 6))
plt.plot(t, np.sqrt(mean_square_dist), 'g.', t, np.sqrt(t), 'r-')
plt.xlabel(r"$t$")
plt.ylabel(r"$\sqrt{\langle (\delta x)^2 \rangle}$")

Out[110]:

Text(0, 0.5, '$\\sqrt{\\langle (\\delta x)^2 \\rangle}$')
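The plot suggests that the root-mean-square distance grows like the square root of the number of steps. The sketch below checks this numerically without plotting. It is a variation on the steps above, not the notebook's exact code: it assumes the newer `np.random.default_rng` generator with an arbitrary seed, uses more walkers to reduce noise, and counts steps from 1 because column k of the cumulative sum holds the position after k + 1 steps.

```python
import numpy as np

rng = np.random.default_rng(0)   # the seed is an arbitrary choice for reproducibility
n_stories = 5000                 # more walkers than above, to reduce noise
t_max = 200

# Each step is -1 or +1 with equal probability.
steps = rng.choice([-1, 1], size=(n_stories, t_max))

# Column k holds every walker's position after k + 1 steps.
positions = np.cumsum(steps, axis=1)
rms_distance = np.sqrt(np.mean(positions**2, axis=0))

n_steps = np.arange(1, t_max + 1)
relative_error = np.abs(rms_distance - np.sqrt(n_steps)) / np.sqrt(n_steps)
print(relative_error.max())
```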
